Kernel CCA for multi-view learning of acoustic features using articulatory measurements
Abstract
We consider the problem of learning transformations of acoustic feature vectors for phonetic frame classification, in a multi-view setting where articulatory measurements are available at training time but not at test time. Canonical correlation analysis (CCA) has previously been used to learn linear transformations of the acoustic features that are maximally correlated with articulatory measurements. Here, we learn nonlinear transformations of the acoustics using kernel canonical correlation analysis (KCCA). We present an incremental SVD approach that makes the KCCA computations feasible for typical speech data set sizes. In phonetic frame classification experiments on data drawn from the University of Wisconsin X-ray Microbeam Database, we find that KCCA provides consistent improvements over linear CCA, as well as over single-view unsupervised dimensionality reduction.
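To make the multi-view setting concrete, the following is a minimal sketch of the linear CCA baseline mentioned in the abstract, not the paper's KCCA or its incremental SVD procedure. It assumes regularized covariance estimates and synthetic data standing in for paired acoustic and articulatory frames; the function name `fit_linear_cca`, the regularizer `reg`, and all dimensions are hypothetical choices made for illustration. Both views are used to fit the projection, but only the acoustic view is needed to transform new frames.

```python
import numpy as np

def fit_linear_cca(X, Y, k, reg=1e-3):
    """Fit linear CCA on paired views and return the acoustic-side projection.

    X: (n, dx) acoustic features, Y: (n, dy) articulatory measurements.
    Returns the acoustic mean, a (dx, k) projection matrix, and the top-k
    (regularized) canonical correlations. `reg` is an assumed ridge term.
    """
    n = X.shape[0]
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mean_x, Y - mean_y
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with a Cholesky factor: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    # Map the left singular vectors back to the original acoustic coordinates.
    A = np.linalg.solve(Lx.T, U[:, :k])
    return mean_x, A, s[:k]

# Synthetic stand-in for paired data: both views share a 2-D latent signal.
rng = np.random.default_rng(0)
n_train, n_test, dx, dy, k = 2000, 50, 6, 4, 2
Z = rng.standard_normal((n_train, 2))
X_train = Z @ rng.standard_normal((2, dx)) + 0.1 * rng.standard_normal((n_train, dx))
Y_train = Z @ rng.standard_normal((2, dy)) + 0.1 * rng.standard_normal((n_train, dy))

# Training uses both views.
mean_x, A, corrs = fit_linear_cca(X_train, Y_train, k)

# At test time only acoustic frames are available.
X_test = rng.standard_normal((n_test, dx))
feats_test = (X_test - mean_x) @ A
```

The projected features `feats_test` would then be passed to a frame classifier. A kernel variant replaces the covariance matrices with kernel (Gram) matrices, whose size grows with the number of training frames; this is the cost the abstract's incremental SVD approach is meant to make feasible.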
Similar resources
Using articulatory measurements to learn better acoustic features
We summarize recent work on learning improved acoustic features, using articulatory measurements that are available for training but not at test time. The goal is to improve recognition using articulatory information, but without explicitly solving the difficult acoustics-to-articulation inversion problem. We formulate the problem as learning a (linear or nonlinear) transformation of standard a...
Multi-view Acoustic Feature Learning Using Articulatory Measurements
We consider the problem of learning a linear transformation of acoustic feature vectors for phonetic frame classification, in a setting where articulatory measurements are available at training time. We use the acoustic and articulatory data together in a multi-view learning approach, in particular using canonical correlation analysis to learn linear transformations of the acoustic features tha...
Acoustic feature learning using cross-domain articulatory measurements
Previous work has shown that it is possible to improve speech recognition by learning acoustic features from paired acoustic-articulatory data, for example by using canonical correlation analysis (CCA) or its deep extensions. One limitation of this prior work is that the learned feature models are difficult to port to new datasets or domains, and articulatory data is not available for most spee...
Multiview Representation Learning via Deep CCA for Silent Speech Recognition
Silent speech recognition (SSR) converts non-audio information such as articulatory (tongue and lip) movements to text. Articulatory movements generally have less information than acoustic features for speech recognition, and therefore, the performance of SSR may be limited. Multiview representation learning, which can learn better representations by analyzing multiple information sources simul...
Intra-View and Inter-View Supervised Correlation Analysis for Multi-View Feature Learning
Multi-view feature learning is an attractive research topic with great practical success. Canonical correlation analysis (CCA) has become an important technique in multi-view learning, since it can fully utilize the inter-view correlation. In this paper, we mainly study the CCA based multi-view supervised feature learning technique where the labels of training samples are known. Several supervi...
Publication date: 2012